Observability & Monitoring
A complete guide to setting up monitoring and observability for Exivity on Kubernetes.
Prometheus, Prometheus Operator, and Grafana are third-party observability products. Exivity provides metrics endpoints, dashboard examples, and alert examples, but you are responsible for operating and supporting your monitoring platform.
Exivity Health Monitoring
Exivity provides built-in health monitoring capabilities through:
Built-in Health Endpoints
Every Exivity service exposes a health endpoint at /healthz on port 8000, which Kubernetes probes use:
- Liveness probes - Detect if the service needs restarting
- Readiness probes - Determine if the service can accept traffic
- Metrics endpoints - Expose Prometheus-compatible metrics at /metrics when prometheus.metricServer.enabled is set to true
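For orientation, these probes correspond to standard Kubernetes container probes. The fragment below is a sketch of how they might appear in a rendered pod spec, assuming plain HTTP GET checks. The manifest that the Exivity chart actually generates may differ.

```yaml
# Sketch of a container's probe section (assumes httpGet probes; not taken from the chart)
livenessProbe:
  httpGet:
    path: /healthz        # health endpoint described above
    port: 8000
  initialDelaySeconds: 3
  periodSeconds: 30
  failureThreshold: 120
readinessProbe:
  httpGet:
    path: /healthz
    port: 8000
  initialDelaySeconds: 3
  periodSeconds: 30
  failureThreshold: 60
```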
Exivity-Specific Metrics
Exivity exposes these custom metrics for monitoring:
| Metric Name | Description | Values | Usage |
|---|---|---|---|
| 📈 exivity_up | Service health status | 1 (healthy), 0 (down) | Overall service availability |
| 📈 exivity_command_healthy | Individual command health | 1 (healthy), 0 (unhealthy) | Component-level monitoring |
| 💾 exivity_nfs_dir_writable | NFS directory write status | 1 (writable), 0 (not writable) | Storage health monitoring |
Configuration
Enable Exivity Monitoring
Configure monitoring in your values.yaml:
# Enable health probes (enabled by default)
probes:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 3
    periodSeconds: 30
    failureThreshold: 120
  readinessProbe:
    enabled: true
    initialDelaySeconds: 3
    periodSeconds: 30
    failureThreshold: 60

# Enable Prometheus metrics collection
prometheus:
  metricServer:
    enabled: true
    serviceMonitor:
      enabled: true # Creates ServiceMonitor for prometheus-operator
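With these values, the time Kubernetes tolerates consecutive probe failures is roughly failureThreshold × periodSeconds. For the liveness probe that is 120 × 30 s = 3600 s (about 60 minutes) before the container is restarted. For the readiness probe it is 60 × 30 s = 1800 s (about 30 minutes) before the pod is marked unready. If that window is longer than you want, an override along these lines is one option. The numbers are illustrative assumptions, not Exivity recommendations.

```yaml
# Illustrative override only; thresholds are assumptions, tune them for your workloads
probes:
  livenessProbe:
    enabled: true
    periodSeconds: 30
    failureThreshold: 20   # 20 x 30 s = 600 s (~10 minutes) of failures before restart
  readinessProbe:
    enabled: true
    periodSeconds: 30
    failureThreshold: 4    # 4 x 30 s = 120 s (~2 minutes) before traffic is withheld
```

Shorter windows surface problems sooner but risk restarting services that are busy with long-running work, so test any change before rolling it out.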
Monitoring Dashboard and Alerts
Exivity provides ready-to-use monitoring configurations for Kubernetes deployments using Prometheus and Grafana. This allows you to monitor service health, NFS storage, and readiness directly in your cluster.
Grafana Dashboard
A ready-to-use Grafana dashboard is provided:
- File: exivity-health.grafana.json (download)
How to use:
- Import this JSON file into your Grafana instance (see Grafana import docs)
- The dashboard visualizes Exivity service health, NFS writability, and command status using Prometheus metrics.
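As an alternative to importing through the UI, Grafana can load dashboards from disk through file provisioning. The sketch below shows a minimal provider definition, assuming the JSON file has been copied to a directory of your choosing. The provider name, folder, and paths are hypothetical.

```yaml
# e.g. /etc/grafana/provisioning/dashboards/exivity.yaml (hypothetical file name)
apiVersion: 1
providers:
  - name: exivity            # hypothetical provider name
    folder: Exivity          # Grafana folder that will hold the dashboard
    type: file
    disableDeletion: false
    options:
      # Hypothetical directory containing exivity-health.grafana.json
      path: /var/lib/grafana/dashboards/exivity
```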
Prometheus Alert Rules
A set of Prometheus alert rules is provided for Exivity:
- File: readiness-probe.rules.yaml (download)
- Alerts included:
  - ServiceDown - Critical alert when exivity_up == 0 for 10 minutes
  - NfsDirNotWritable - Critical alert when exivity_nfs_dir_writable == 0 for 10 minutes
  - CommandHealthy - Critical alert when exivity_command_healthy == 0 for 10 minutes
How to use:
- Add this YAML file to your Prometheus alerting rules configuration
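To make the shape of these rules concrete, here is a sketch of a Prometheus rule file expressing the three alerts described above. The alert names, expressions, duration, and severity follow the descriptions in this section. The group name and annotation text are assumptions, and the shipped readiness-probe.rules.yaml may differ in detail.

```yaml
groups:
  - name: exivity-health              # hypothetical group name
    rules:
      - alert: ServiceDown
        expr: exivity_up == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Exivity service is down"              # illustrative text
      - alert: NfsDirNotWritable
        expr: exivity_nfs_dir_writable == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Exivity NFS directory is not writable" # illustrative text
      - alert: CommandHealthy
        expr: exivity_command_healthy == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Exivity command is unhealthy"          # illustrative text
```

In a plain Prometheus setup, a file like this is referenced from the rule_files list in prometheus.yml. With Prometheus Operator, rule groups are usually supplied through a PrometheusRule resource instead.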
Requirements
- Prometheus must be scraping the Exivity metrics endpoints
- Prometheus Operator must be installed if you enable prometheus.metricServer.serviceMonitor.enabled
- Grafana must be connected to your Prometheus data source
- For more details on setup, see the Prometheus and Grafana documentation
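If you do not run Prometheus Operator, the scraping requirement can be met with a static scrape job. The fragment below is a minimal sketch for prometheus.yml. The job name and target hostname are hypothetical, and it assumes that /metrics is served on the same port 8000 as /healthz, which you should verify for your deployment.

```yaml
scrape_configs:
  - job_name: exivity                 # hypothetical job name
    metrics_path: /metrics
    scrape_interval: 30s
    static_configs:
      - targets:
          # Hypothetical service DNS name; port 8000 is an assumption
          - exivity-service.exivity.svc.cluster.local:8000
```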